
21 research outputs found

Exploiting Context-Dependent Quality Metadata for Linked Data Source Selection

Author: Yaman Beyza
Publication venue: Università degli studi di Genova
Publication date: 23/05/2018

The traditional Web is evolving into the Web of Data, which consists of huge collections of structured data over poorly controlled distributed data sources. Live queries are needed to get current information out of this global data space. In live query processing, source selection deserves attention since it allows us to identify the sources that are likely to contain the relevant data. The thesis proposes a source selection technique in the context of live query processing on Linked Open Data, which takes into account the context of the request and the quality of data contained in the sources to enhance the relevance (since the context enables a better interpretation of the request) and the quality of the answers (which will be obtained by processing the request on the selected sources). Specifically, the thesis proposes an extension of the QTree indexing structure, which had been proposed as a data summary to support source selection based on source content, to take into account quality and contextual information. With reference to a specific case study, the thesis also contributes an approach, relying on the Luzzu framework, to assess the quality of a source with respect to a given context (according to different quality dimensions). An experimental evaluation of the proposed techniques is also provided.

Archivio istituzionale della ricerca - Università di Genova

Interlinking SciGraph and DBpedia Datasets Using Link Discovery and Named Entity Recognition Techniques

Author: Freudenberg Markus
Pasin Michele
Yaman Beyza
Publication venue: OASIcs - OpenAccess Series in Informatics. 2nd Conference on Language, Data and Knowledge (LDK 2019)
Publication date: 01/01/2019

In recent years we have seen a proliferation of Linked Open Data (LOD) compliant datasets becoming available on the web, leading to an increased number of opportunities for data consumers to build smarter applications which integrate data coming from disparate sources. However, the integration is often not easily achievable since it requires discovering and expressing associations across heterogeneous data sets. The goal of this work is to increase the discoverability and reusability of the scholarly data by integrating them with highly interlinked datasets in the LOD cloud. In order to do so we applied techniques that a) improve the identity resolution across these two sources using Link Discovery for the structured data (i.e. by annotating Springer Nature (SN) SciGraph entities with links to DBpedia entities), and b) enrich SN SciGraph unstructured text content (document abstracts) with links to DBpedia entities using Named Entity Recognition (NER). We published the results of this work using standard vocabularies and provided an interactive exploration tool which presents the discovered links w.r.t. the breadth and depth of the DBpedia classes.

Dagstuhl Research Online Publication Server

Context-Dependent Quality-Aware Source Selection for Live Queries on Linked Data

Author: Catania Barbara
Guerrini Giovanna
Yaman Beyza
Publication venue: OpenProceedings.org
Publication date: 01/01/2016

Archivio istituzionale della ricerca - Università di Genova

Context Aware Source Selection for Linked Data

Author: Catania Barbara
Guerrini Giovanna
Yaman Beyza
Publication venue: CEUR-WS
Publication date: 01/01/2018

The traditional Web is evolving into the Web of Data, which gathers huge collections of structured data over distributed, heterogeneous data sources. Live queries are needed to get current information out of this global data space. In live query processing, source selection allows the identification of the sources that most likely contain relevant content. Due to the semantic heterogeneity of the Web of Data, however, it is not always easy to assess relevancy. Context information might help in interpreting the user’s information needs. In this paper, we discuss how context information can be exploited to improve source selection.

Archivio istituzionale della ricerca - Università di Genova

LinkedDataOps: linked data operations based on quality process cycle

Author: Brennan Rob
Yaman Beyza
Publication venue: CEUR-WS
Publication date: 17/09/2020

This paper describes three new Geospatial Linked Data (GLD) quality metrics that help evaluate conformance to standards. Standards conformance is a key quality criterion, for example for FAIR data. The metrics were implemented in the open source Luzzu quality assessment framework and used to evaluate four public geospatial datasets that showed a wide variation in standards conformance. This is the first set of Linked Data quality metrics developed specifically for GLD.

DCU Online Research Access Service

Distribution and occurrence of microsporidian pathogens of the willow flea beetle, Crepidodera aurata (Coleoptera: Chrysomelidae) in North Turkey

Author: Algi Gönül
Güner Beyza
Yaman Mustafa
Ünal Sabri
Publication venue: Entomologica Fennica
Publication date: 10/12/2015

In this study, microsporidian pathogens in Crepidodera aurata populations were investigated. In total, 1,728 C. aurata adults were examined for microsporidian pathogens, and 78 of them were found to be infected. Two species of microsporidia, Microsporidium sp. 1 and Microsporidium sp. 2, were observed in the C. aurata populations from ten localities in North Turkey. They differ considerably from each other in spore morphology and dimensions, infection rate and host locality. The spores of Microsporidium sp. 1 were oval in shape and measured from 3.66 to 5.66 µm in length and from 1.35 to 2.22 µm in width (n=50). The spores of Microsporidium sp. 2 were slightly curled and measured from 2.44 to 3.55 µm in length and from 1.25 to 1.55 µm in width (n=50). These microsporidia were recorded from C. aurata for the first time. Here we present the occurrence and distribution of two microsporidia in C. aurata populations as natural, potentially suppressing factors.

Journal.fi

Quality metrics to measure the standards conformance of geospatial linked data

Author: Brennan Rob
Thompson Kevin
Yaman Beyza
Publication date: 01/10/2020

DCU Online Research Access Service

A SKOS taxonomy of the UN global geospatial information management data theme

Author: Brennan Rob
Thompson Kevin
Yaman Beyza
Publication venue: GeoLD2021
Publication date: 01/01/2021

Complex data domains increase the difficulty of structuring, sharing, discovering and governing information. For the geospatial domain, common models such as INSPIRE have been established in the European Union. The United Nations initiative on Global Geospatial Information Management (UN-GGIM) draws together national and regional capacities. Interoperability is the main principle behind these initiatives. Nonetheless, there is a lack of published research to date on mapping agency geospatial linked data leveraging the UN-GGIM taxonomy of information management data themes. Thus, we have identified use cases and defined a Simple Knowledge Organization System (SKOS, https://www.w3.org/TR/skos-reference/) taxonomy expressing the UN-GGIM data themes for national spatial infrastructure. This has been applied in a metadata generation and reporting tool for Ordnance Survey Ireland (OSi) which underpinned improved governance and reporting infrastructure in OSi. This demonstrated the contribution of Semantic Web technology to spatial data governance as well as its importance for data publishing. This paper presents a documented open license SKOS taxonomy for the UN-GGIM data themes that follows Linked Data best practices. It provides a set of three use cases, an overview of UN-GGIM theme definitions and an example application of the taxonomy for deployment in OSi for DCAT metadata generation and data publishing pipeline reporting.

DCU Online Research Access Service

Data quality and patient characteristics in European ANCA-associated vasculitis registries: data retrieval by federated querying

Author: Aslett Louis
Basu Neil
Dradin François
Gisslander Karl
Hederman Lucy
Hruskova Zdenka
Kardaoui Hicham
Lamprecht Peter
Lichołai Sabina
Little Mark A
Mohammad Aladdin J
Musial Jacek
O’Sullivan Declan
Puechal Xavier
Rutherford Matthew
Scott Jennifer
Segelmark Mårten
Straka Richard
Terrier Benjamin
Tesar Vladimir
Tesi Michelangelo
Vaglio Augusto
Wandrei Dagmar
White Arthur
Wójcik Krzysztof
Yaman Beyza
Publication venue: BMJ Publishing Group
Publication date: 31/10/2023

Objectives: This study aims to describe the data structure and harmonisation process, explore data quality and define characteristics, treatment, and outcomes of patients across six federated antineutrophil cytoplasmic antibody-associated vasculitis (AAV) registries. Methods: Through creation of the vasculitis-specific Findable, Accessible, Interoperable, Reusable, VASCulitis ontology, we harmonised the registries and enabled semantic interoperability. We assessed data quality across the domains of uniqueness, consistency, completeness and correctness. Aggregated data were retrieved using the semantic query language SPARQL Protocol and Resource Description Framework Query Language (SPARQL) and outcome rates were assessed through random effects meta-analysis. Results: A total of 5282 cases of AAV were identified. Uniqueness and data-type consistency were 100% across all assessed variables. Completeness and correctness varied from 49%–100% to 60%–100%, respectively. There were 2754 (52.1%) cases classified as granulomatosis with polyangiitis (GPA), 1580 (29.9%) as microscopic polyangiitis and 937 (17.7%) as eosinophilic GPA. The pattern of organ involvement included: lung in 3281 (65.1%), ear-nose-throat in 2860 (56.7%) and kidney in 2534 (50.2%). Intravenous cyclophosphamide was used as remission induction therapy in 982 (50.7%), rituximab in 505 (17.7%) and pulsed intravenous glucocorticoid use was highly variable (11%–91%). Overall mortality and incidence rates of end-stage kidney disease were 28.8 (95% CI 19.7 to 42.2) and 24.8 (95% CI 19.7 to 31.1) per 1000 patient-years, respectively. Conclusions: In the largest reported AAV cohort-study, we federated patient registries using semantic web technologies and highlighted concerns about data quality. The comparison of patient characteristics, treatment and outcomes was hampered by heterogeneous recruitment settings.

Durham Research Online

Enlighten

Jagiellonian University Repository


Results of the ontology alignment evaluation initiative 2020

Author: Algergawy Alsayed
Amini Reihaneh
Faria Daniel
Fundulaki Irini
Harrow Ian
Hertling Sven
Hitzler Pascal
Jiménez-Ruiz Ernesto
Jonquet Clement
Karam Naouel
Khiat Abderrahmane
Laadhar Amir
Lambrix Patrick
Li Huanyu
Li Ying
Paulheim Heiko
Pesquita Catia
Pour Mina Abd Nikooie
Saveta Tzanina
Shvaiko Pavel
Splendiani Andrea
Thiéblin Élodie
Trojahn Cassia
Vataščinová Jana
Yaman Beyza
Zamazal Ondřej
Zhou Lu
Publication venue: CEUR-WS
Publication date: 01/01/2020

The Ontology Alignment Evaluation Initiative (OAEI) aims at comparing ontology matching systems on precisely defined test cases. These test cases can be based on ontologies of different levels of complexity and use different evaluation modalities (e.g., blind evaluation, open evaluation, or consensus). The OAEI 2020 campaign offered 12 tracks with 36 test cases, and was attended by 19 participants. This paper is an overall presentation of that campaign.

City Research Online

Scientific Publications of the University of Toulouse II Le Mirail

MAnnheim DOCument Server